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Abstract — One  of  the  instruments  on  the  sun-synchronous  Terra 
(EOS  AM)  and  Aqua  (EOS  PM)  spacecraft,  the  Moderate  Reso¬ 
lution  Spectroradiometer  (MODIS),  obtains  calibration  data  once 
during  every  orbit.  Observations  of  the  sun  permit  corrections  to 
observations  of  the  earth  during  the  ensuing  orbit.  Although  the  in¬ 
strument  was  designed  to  receive  uniform  sunlight  over  the  entire 
surface  of  its  detector,  the  sunlight  was  in  fact  not  uniform.  While 
this  did  not  adversely  affect  the  calibration,  it  nonetheless  implied  a 
lack  of  understanding  of  how  the  optical  system  really  functioned. 
To  learn  what  was  wrong,  NASA  used  an  optical  ray-tracing  pro¬ 
gram  on  a  DEC  Alpha  computer.  The  results  correlated  well  with 
the  observations  made  by  the  instrument  itself,  but  it  took  nearly 
two  weeks  to  complete  the  computer  simulation,  a  discouragingly 
long  time.  This  paper  describes  the  algorithm  and  its  implementa¬ 
tion  in  a  system  with  multiple  digital  signal  processor  (DSP)  chips 
operating  in  parallel.  Timing  data  show  a  highly  linear  relationship 
between  the  number  of  DSPs  present  and  the  speed  of  the  compu¬ 
tation.  Administrative  overhead  is  negligible  compared  to  the  time 
taken  to  compute  ray  trajectories.  This  implies  that  many  more 
than  just  four  DSPs  could  be  harnessed  before  administrative  over¬ 
head  would  begin  to  be  significant. 

Index  Terms — Digital  signal  processor  (DSP),  optical  ray  tracing, 
optics,  parallel  processing,  reconfigurable  computer  (RC),  recon- 
figurable  computing. 


I.  Introduction 

IN  ORDER  to  speed  up  optical  ray  tracing  simulations,  the 
authors  investigated  the  implementation  of  ray-tracing  algo¬ 
rithms  in  a  PC-based  system  with  multiple  digital  signal  pro¬ 
cessor  (DSP)  microprocessors  within.  Doing  these  computa¬ 
tions  took  about  two  weeks  on  a  particular  uniprocessor  system. 
This  paper: 

1)  explains  the  mathematics  associated  with  optical  ray 
tracing; 

2)  explores  the  use  of  multiple  coordinate  systems  to  sim¬ 
plify  such  mathematics; 

3)  discusses  how  we  divided  the  work  up  into  tasks  which 
can  be  executed  in  parallel  on  multiple  processors;  and 

4)  presents  measurements  showing  how  speed  of  processing 
increases  nearly  linearly  with  an  increase  in  the  number 
of  processors  available. 

Rather  than  simply  use  a  supercomputer,  we  investigated  the 
use  of  low-cost,  reconfigurable  computer  (RC)  architectures  and 
digital  signal  processors. 


Fig.  1.  Spherical  Lens. 


Notes  on  the  mathematical  notation  used  appear  in  the  Ap¬ 
pendix. 


II.  Optics 

Ray  tracing  is  one  of  the  principle  tools  for  an  optical  system 
designer  [1],  Computer-based  ray  tracing  has  a  history  going 
back  for  more  than  40  years.  An  early  paper  presenting  a  uni¬ 
fied  ray-tracing  procedure  that  is  applicable  to  optical  systems 
with  very  general  kinds  of  surfaces  is  Spencer  and  Murty  [2], 
Their  procedures  rely  on  matrix  methods  offering  a  compact  no¬ 
tation.  While  some  other  authors  also  use  matrix-based  methods 
[3],  others  avoid  them  [4],  [5].  Of  those  using  matrix-based 
methods,  some  use  two-dimensional  (2-D)  matrix  methods  [6] 
while  others  use  three-dimensional  (3-D)  matrix  methods  [7], 
In  this  section,  we  summarize  the  3-D  matrix-based  equations 
for  optical  reflection  and  refraction  of  rays  impinging  on  various 
simple  geometric  surfaces  as  developed  in  [7],  The  notation  is 
somewhat  simplified  here  for  greater  clarity. 

A.  Optical  Surfaces 

While  in  principle,  lenses  can  be  made  with  any  arbitrary  sur¬ 
face,  for  simplicity,  we  usually  consider  only  simple  geometric 
shapes:  planes  and  conicoids  (spheres,  paraboloids,  ellipsoids, 
and  hyperboloids).1 

1 )  A  Spherical  Lens:  Fig.  1  shows  a  crosssection  of  a  sphere. 
Not  shown  are  the  x-  and  y- axes,  orthogonal  to  the  2-axis  and  to 
each  other.  This  sphere  may  be  regarded  as  a  lens  whose  vertex 
is  at  point  (0, 0, 0)  and  which  is  centered  around  the  positive 
2-axis.  For  such  a  spherical  lens  of  radius  R 

R?  =  x2  +  y2  +  (z  -  R)2.  (1) 
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If  we  define 

r2  =  x2  +  y 2  (2) 

'Paraboloids,  ellipsoids,  and  hyperboloids  are  the  surfaces  described  by 
parabolas,  ellipses,  and  hyperbolas  rotated  about  an  axis  of  symmetry.  In  the 
case  of  hyperbolas,  the  axis  is  usually  considered  to  be  the  longitudinal  or 
semimajor  axis.  In  the  case  of  ellipses,  either  axis  may  be  considered. 
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Fig.  2.  Prolate  Ellipsoid.  The  semimajor  axis  is  a,  the  semi  minor  axis  is  6,  and 
the  eccentricity  is  e. 

and  let  the  curvature  c  =  1/R,  then  it  can  be  shown  that  for  a 
sufficiently  small  value  of  r 


I  n  other  words,  the  surface  of  a  sphere  can  be  approximated  by 
a  parabolic  surface.  This  approximation  is  only  reasonable  if 
r  <  R,  which  is  the  case  for  a  typical  spherical  lens. 

2 )  Prolate  Ellipsoid  Lens  and  Other  Conicoid  Lenses:  F  i  g .  2 
shows  a  cross  section  of  a  prolate  ellipsoid2  with  eccentricity  e, 
semi-major  axis  a,  and  semi-minor  axis  b.  Variable  r  is  given 
by  r2  =  x2  +  y2  as  in  (2). 

An  equation  of  this  ellipsoid  is 


(z  —  a)2  r2 

—  +  =1 
a 2  +  b 2 

(4) 

We  can  define  the  eccentricity  eof  the  ellipse  implicitly  by 

b2  =  a2(  1-e2) 

(5) 

and  if  we  define 

e  =  1  —  e2 

(6) 

and 

1 

c  =  — 
ae 

(7) 

then  it  can  be  shown  that  the  least  positive  position  z  (leftmost 
in  Fig.  2)  at  which  the  surface  of  the  ellipsoid  is  a  radiusr  from 
the  ^-axis  is  given  by 

2 

crz 

1  +  \Jl  —  e(cr)2 

(8) 

If  we  define  the  conic  constant 

b2  o 

k  =  £  —  1  =  —  —  1  =  —e2 
a 1 

(9) 

then  we  can  rewrite  (8)  as 

2 

cr 

1  +  \J\  —  (1  +  k)c2r2 

(10) 

Itcan  be  shown  (see  [7])  that  this  equation  isapplicablenotjust 

for  prolate  ellipsoids  but  for  all  conicoids.  It  can  be  expanded  in 
a  power  series  usi  ng  a  Tay  I  or  series  about  r  =  0,  gi  vi  ng  the  same 
result  given  earlier  in  (3)  fora  spherical  surface.  In  other  words, 
a  parabolic  surface  can  approximate  both  a  spherical  surface 
and  a  prolate  ellipsoidal  surface.  The  table  provided  in  [7]  and 

2Prol ate  with  respect  to  the  optical  axis  z. 


TABLE  I 

Parameters  of  the  Conicoids 


Eccentricity 

Conic  Constant 

£ 

Conic  Section 

k  >  0 

£  >  1 

oblate  ellipsoid 

e  =  0 

k  =  0 

e  =  1 

sphere 

0  <  e  <  1 

-1  <  k  <  0 

0  <  £  <  1 

prolate  ellipsoid 

e  =  1 

k  =  - 1 

£  —  0 

paraboloid 

e  >  1 

k  <  —  1 

£  <  0 

hyperboloid 

Fig.  3.  Intersection  of  a  line  with  a  plane. 


reproduced  hereasTable  I  shows  how  to  pick  the  conic  constant 
k  for  various  conicoids. 


B.  Ray  Intersections 

To  trace  a  ray’s  path  requires  the  following  two  steps. 

1)  Find  the  point  where  a  ray  intersects  an  optical  surface. 

2)  Find  the  direction  the  ray  takes  after  striking  the  surface. 

In  this  section,  we  consider  the  first  of  these  problems  for  var¬ 
ious  shapes  of  optical  surfaces. 

1)  Intersection  of  a  Line  With  a  Plane:  Consider  Fig.  3.  It 
shows  a  ray  departing  initial  point  P0  with  direction  vectora>Po. 
Unless  it  is  parallel  to  (or  pointing  away  from)  the  plane  de¬ 
picted,  it  will  strike  the  plane  at  point  P,  whose  coordinates  we 
wish  to  determine.  The  plane  can  be  defined  by  its  unit-length 
normal  aN  and  a  particular  point  Pi  in  the  plane.  It  can  be  shown 
that 


P  =  Po  + 


a-N  ■  (Pi  ~  Pq ) 
o-n  •  &pa 


“Pa- 


(ID 


If  P0  is  already  in  the  plane,  then  aN  ■  (Pi  -  P0)  =  0  and 
obviously  in  this  case  P  =  P0.  If  the  ray  is  parallel  to  the 
plane,  then  aN  ■  CjPo  =  0,  resulting  in  division  by  zero  if  (12) 
is  applied.  A  computer  program  must  check  for  this  condition 
before  applying  the  equation. 

I  n  the  special  case  where  we  choose  our  coordinate  system  so 
that  Px  =  (0, 0, 0),  we  can  write  (11)  as 

P  =  Po~  aN'P°up0.  (12) 

«7V  -  WP0 

It  should  be  noted  that  even  if  the  ray  is  pointing  away  from 
the  plane,  (11)  and  (12)  will  find  the  intersection  of  the  line 
(col I  inear  with  the  ray)  and  the  plane.  A  computer  program  must 
check  to  see  if  this  is  the  case. 
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Fig.  4.  Intersection  of  a  line  with  a  conic  surface. 


2)  Intersection  ofa  Line  With  a  Conic  Surface:  From(2)and 
(10),  a  general  conic  surface,  a  conicoid,  can  be  described  by  an 
equation  of  the  form 


g(x,y,z)  =  z  - 


c(x 2  +  y2) 

1  +  yjl  —  (1  +  k)c2(x2  +  y2) 


(13) 


Such  a  surface  and  a  line  originating  at  point  P0  and  intersecting 
the  surface  are  shown  in  Fig.  4.  The  direction  of  the  line  is  given 
by  the  unit  vector  CjPo  =  Lux  +  Muy  +  Nuz. 

Wecanfollow  an  iterative  procedure  as  indicated  i  n  the  figure 
to  discover  the  point  where  the  line  intersects  the  surface.  We  do 
this  by  finding  a  series  of  points  of  intersection  between  the  line 
and  surfaces  tangent  to  the  conic  surface. 

Although  Fig.  4  implies  that  they-axis  does  not  intersect  the 
conic,  we  are  free  to  define  any  coordinate  system  we  like.  It  is 
often  convenient  to  choose  the  xy- plane  as  being  tangent  to  the 
vertex  of  the  conic  surface.  If  we  do  this,  then  the  point  Px  = 
(0,0,0)  becomes  a  known  point  in  the  first  plane  considered. 
This  plane  has  normal  dN1  =  iiz.  Using  (11),  we  can  find  the 
point  P'x  where  the  line  intersects  this  plane. 


p,  =  Po  +  a^-tP.-Po)  (14l 

«JV1  •  Wp0 

in  particular,  we  can  find  the  values  of  x  and  y  of  the  point 
of  intersection.  (In  the  figure,  x  =  0,  but  in  general,  it  could  be 
anything.) 

Next,  we  find  a  point  P2  with  the  same  values  of  x  and  y 
but  lying  on  the  conic  surface.  This  point  satisfies  the  equation 
g(x,y,z )  =  0,  so  it  has  the  coordinates 


P2  = 


c(x'12  +  y'1z) 


V  1  +  y  1  -  (!  +  k)°2  (4 2  +v'i)  j 


(15) 


After  this,  we  find  the  plane  tangent  to  the  conic  surface  at 
point  P2.  This  requires  finding  the  partial  derivatives  of  g  and 
evaluating  them  at  point  P2.  If  we  define 


a  =  sjl  —  (1  +  k)c2  ( x 2  +  y 2)  (16) 

b=  ^  +  2  (2a(l  +  a)  +  c2(l  +  k){x 2  +  y2))  (17) 

and  _ 

d=sJ(l/b)‘2  +  x2  +  y2  (18) 


then  it  can  be  shown  that  the  direction  cosines  of  the  normal  to 
the  tangent  plane  are 


X2 

a2--- 

(19) 

1 

II 

<20. 

(20) 

'a  =  +h 

(21) 

and  the  unit-length  normal  to  the  tangent  plane  at  point  P2  is 
given  by 


«7V2  =  Ot2Ux  +  p2  Uy  +  72M2.  (22) 

This  procedure  can  be  repeated:  from  the  intersection  of  the 
extended  ray  with  a  plane  tangent  to  theconic,  wecan  determine 
a  nearby  point  on  the  conic,  and  from  this  we  can  determine  a 
new  tangent  plane.The  entire  procedure  is  repeated  until  succes¬ 
sive  points  are  close  enough  together  to  be  regarded  as  equiva¬ 
lent. 

Fig.  5  summarizes  the  steps  in  finding  the  intersection  be¬ 
tween  a  line  and  a  conic  surface. 

3)  Intersection  of  a  Ray  and  a  Spherical  Surface:  A  sphere 
is  a  special  case  of  a  conic,  of  course,  but  there  is  a  simpler 
method  of  obtaining  the  point  of  intersection  of  a  ray  and  a 
spherical  surface,  one  that  avoids  the  iteration  shown  above. 
Fig.  6  shows  a  ray  departing  initial  point  P0  with  direction 
vector  wp0  and  striking  point  P.  It  can  be  shown  that  the  steps 
in  Fig.  7  let  us  find  the  coordinates  of  point  P  and  the  sphere’s 
centrally  directed  normal  aN  at  this  point. 

C.  Reflections 

Now  that  we  know  how  to  discover  the  point  on  a  plane  or  a 
conic  surface  where  a  ray  strikes  it,  we  can  turn  our  attention 
to  what  happens  to  the  ray  next,  in  a  typical  optical  system, 
part  of  the  ray  is  reflected  and  part  is  refracted,  but  this  is  a 
complication  which  weshall  neglect.  Fora  reflective  surface,  we 
shall  disregard  the  refracted  portion,  and  for  a  refractive  surface, 
we  shall  disregard  the  reflected  portion.3 

1 )  Reflection:  Fig.  8  shows  a  unit-length  ray  CjPq  origi¬ 
nal  ng  at  i  ni ti al  poi  nt  P0  and  i  ncident  on  a  surface  S  at  poi nt  O. 
Its  reflection,  unit-length  ray  tflPl,  terminates  at  point  Pi.  The 

including  these  contributions  would  greatly  increase  the  number  of  rays 
traced  but  would  increase  the  accuracy  of  the  model. 
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For  a  ray  with  direction  u)p0  —  Lux  +  Muy  +  Nil .  originating  at  point 
Po  =  (.To,  2/o,  zo)  set  Pi  =  0,  d,v,  =  d=,  j  =  1. 

1.  Calculate 


P'  =  Po 


o \  ,  ■  {Pj  P o) 

djVj  ■  uPo 


“Pr 


(from  (11)) 


2.  Set  j  =  j  +  1. 

3.  Set  point 


Pi  = 


c  +  y.7-i2) 


i  +  y  i  -  (i  +  ^)c2  +  y'j-i 2)  / 


4.  Calculate 


6  = 


a  (1  -f  a)‘ 


5.  Calculate 


X  j  n  Vi  1 

aj~~~d’  7j  “  +M- 


(from  (15)) 


a  =  \Jl  -  (1  +  fc)c2  (x2j  +  ! q)  (from  (16)) 

(2a(l  +  a)  +  c2(l  +  k)  (x2  +  j/2))  (from  (17)) 


d  =  \J{l/b)2  +Xj  +  i/2.  (from  (18)) 


(from  (19),  (20),  (21)) 


6.  Let  d  v,  =  Oj u.  +  ,3jUy  +  z, u , . 

7.  Repeat  until  P7  and  P(  differ  by  an  insignificant  amount. 

Fig.  5.  Summary:  Finding  the  intersection  of  a  ray  and  a  conic  surface. 


F  i  g .  8 ,  R  eflecti  on  of  a  ray  at  a  su  rf  ace. 


TV 


Fig.  6.  Ray  tracing  for  a  spherical  lens. 


u>p0  =  Lax  +  May  +  Naz  (23) 

c  =  l/R  (24) 

F  =  c  (x\  +  y\  +  2^)  —  2za  (25) 

G  =  N  —  c  (auL  +  i/^M  +  zaN)  (26) 

p 

A  =  - ;■  .  (27) 

G  ±  VG2  -  cF 

(Select  the  root  which  yields  A  with  minimum  magnitude.) 

P  =  Po  +  A  •  Wp0  (28) 


oat  =  ax  -  cys  ay  +  (1  -  czB)az  (29) 

Fig.  7.  Summary:  Steps  to  find  the  where  a  ray  strikes  a  spherical  surface  and 
the  normal  at  that  point. 


Fig.  9.  Vector  form  of  Sneii’s  Law. 

angle  of  incidence  and  the  angle  of  reflection  both  are  </>.  Given 
the  centrally  directed  unit-length  normal  aN  to  the  surface,  it 
can  be  shown  that 

<l)p1  =  u)p0  —  2(6 bp0  ■  d]v)a .jv-  (30) 

2)  Refraction:  F  ig.  9  shows  a  I  i  ght  ray  £2?^  travel  i  ng  through 
a  medium  with  index  of  refraction  n.  The  ray  passes  through  a 
surface  into  a  different  medium  with  index  of  refraction  n'  with 
angle  of  incidence  <j>.  The  ray  emerges  as  wPl  from  the  surface 
at  a  different  angle,  <fj,  which  we  desire  to  calculate  from  n,n', 
and  4>. 

U  nit  vector  dN  gives  the  direction  of  the  normal  to  the  sur¬ 
face,  coplanar  with  ljPo  and  6jPi,  and  pointing  into  the  new 
medium;  and  dT  is  a  unit  vector  tangent  to  the  surface  such  that 

Cjp0  =  cos  <j)a.N  +  sin  4>clt  (31) 

cjPl  =  cos (fi'ajj-  +  sin  (f/ap-  (32) 

With  these  definitions,  it  can  be  shown  that 

in  f  ti  \ 

Ujp,  =  —  6jpn  +  d.N  ( cos  (j) - 7  cos  (j)  .  (33) 

rv  \  n!  J 

3)  Efficient  Computation:  We  would  like  to  be  able  to  com¬ 
pute  wPl  from  (33)  efficiently  if  we  are  going  to  compute  it 
repeatedly.  Calculating  any  trigonometric  function  is  time-con¬ 
suming  so  we  would  prefer  to  avoid  it,  if  possible. 
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It  can  be  shown  that  if  we  let  q  =  n/n',  then 

cos <j)  =  ajsi  ■  Q)p0  (34) 

cos  <j}'  =  ±y/l  —  q2(  1  —  cos2  (f>).  (35) 

Therefore,  it  suffices  to  compute 

Q)p1  =  qu)p0  +  a;\T  (cos  <j>  —  qcostj))  (36) 

with  cos <f>  given  by  (34)  and  cos<t>'  given  by  (35).  Although 
we  still  must  compute  a  square  root,  this  ordinarily  requires 
considerably  less  time  than  computing  trigonometric  functions. 
This  is  especially  true  with  our  hardware  setup  because  the 
AD21160  contains  special  instructions  to  reduce  the  time  to 
compute  square  roots. 

Should  we  choose  the  positive  or  the  negative  square  root? 
We  should  pick  it  so  the  arithmetic  sign  of  cosy/  is  the  same 
as  that  of  cos 4>.  Commonly,  <j>  <  7r/2  and  <j>'  <  7r/2,  so  both 
trigonometric  functions  are  positive. 

III.  Converting  Between  Different  Coordinate  Systems 

Ray  tracing  with  respect  to  a  particular  surface  is  simplified 
if  all  computations  are  performed  in  a  local  coordinate  system, 
in  coordinate  system  k  associated  with  surface  k,  point  PA  is 
represented  by  vector  rAk,  point  PB  is  represented  by  vector 
rBk ,  and  the  ray  departing  point  A  \su)Ak.  In  a  system  with  n 
surfaces,  the  procedure  for  each  surface  k  e  [0,n  -  1]  is  as 
follows. 

1)  Find  the  point  PB  on  surface  k  struck  by  the  ray  CjAk  orig- 
i  nati  ng  at  poi  nt  PA.  If  the  ray  does  not  stri ke  the  surface,  it 
can  be  discarded.  This  is  tantamount  to  disregarding  other 
forms  of  internal  ray  propagation,  such  as  scattering. 

2)  Find  the  ray  wBk  resulting  from  the  interaction  of  the  ray 
and  the  surface,  whether  due  to  reflection  or  to  refraction. 

3)  C  onvert  vectors  rBk  and  wBk  from  the  coordi  nate  system 
of  optical  medium  k  (applicable  before  interaction  with 
surface  k)  to  the  equivalent  vectors  rA,k+1  and  £jAjk+i  of 
optical  medium  k  +  1  (applicable  after  interaction  with 
surface  k). 

To  carry  out  these  steps  requires  knowing  how  to  do  the  coordi¬ 
nate  conversions. 

Suppose  the  origin  of  local  frame  k  is  located  in  frame  0 
(the  global  frame)  at  point  Tfc,  represented  by  a  vector  tk  = 
Sxkux  +  Sykiiy  +  Szkuz.  To  convert  vectors  expressed  in  local 
frame  k  to  equivalent  vectors  in  the  global  reference  frame, 
frame  0,  requires  first  rotating  them  through  angle ^k  around  the 
-fz-axis,  through  angle -f3k  around  they-axis,  and  last,  through 
angle  -ak  around  the  ar-axis;  and  then  translating  them  to  re¬ 
flect  the  fact  that  the  two  frames  do  not  share  the  same  origin.4 

A  general  point  P  can  be  represented  in  local  frame  k  by  a 
vector  rk  =  xkux  +  ykuy  +  zkuz.  The  same  point  can  be 


4A  positive  angle  is  that  required,  say,  to  rotate  iix  to  align  with  uy  in  a  right- 
handed  system.  The  conventional  sequence  of  rotations  described  above  is  not 
followed  universally,  it  is  used  in  the  treatment  by  Kidger  [7]  summarized  here 
and  is  also  used  in  the  commercial  ray-tracing  program  CODE  V  from  Optical 
Research  Associates.  It  is  essential  to  be  cl  ear  about  the  convention  in  use  before 
the  mathematics  will  make  any  sense  at  all. 


represented  in  frame  0  (the  global  frame)  by  a  vector  r0  = 
AjcVfc  +  tk  where  matrix  A/1  is  given  by 

(10  0  \ 

Aj.1  =  0  cos  ak  sin  ak 

y  O  —  sin  ak  cos  ak  ) 

cos  (3k  o  —  sin  f3k  \ 

0  10 

sin  pk  0  cos  /3k  ) 

cos7fc  —  sin  0\ 

sin  7fc  cos7fc  0  . 

0  0  1 J 

Now  consider  another  frame  of  reference,  frame  k  +  1, 
located  in  frame  0  at  point  Tfc+1,  represented  by  a  vector 
tkp i  —  ilx  +  S y^kAkUy  -p  &z,k-{-i'u,z ■  To  convert  vectors 
expressed  in  the  global  reference  frame,  frame  0,  to  equivalent 
vectors  in  local  frame  k  +  1  requires  first  translating  them  to 
the  origin  and  then  rotating  them  through  angle  ak+1  around 
the  z-axis,  through  angle  pk+1  around  the  y- axis,  and  finally 
through  angle  -7fc+1  around  the  -t-z-axis 

ffc+i  =  Afc+i(r0  -  ijfc+i ) 

—  Afc-j-l  ( A^,  Tk  ~\~  tk  tk-\-l ) 

=  (Afc+iA,.  l)rk  +  Afc+i  (tk  —  tk. |_i) 

^fc+i  =  Afc+i;ferfc  +  tk+i,k  (37) 

where 

/  cos  7fc+i  sin  7fc+i  0\ 

Afc+i  =  —  sin  7fc+!  cos7fc+i  0 

0  0  1/ 

cos  pk+i  o  sin  /3fc+i  \ 

0  10 
—  sin  /3fc+i  0  cos/3fc+i  / 

(10  0  \ 

•  0  cosafe+i  -smcLfc+i  (38) 

\0  sinafc+i  cosafc+i  / 

Afc+i,it  =  Afc+iA^1  (39) 

and 

<fc+i,fc  =  Afc+i(tfc  —  tk+ 1).  (40) 

Using  (37),  we  can  calculate  the  vector  representation  rk+1  of 
point  P  in  frame  k  +  1  if  we  know  its  vector  representation  rk 
in  frame  k.  Since  all  rotation  angles  are  fixed  in  a  given  lens 
system,  the  matrices  Afc+i,fc  and  the  vectors  tk+i  can  be  com¬ 
puted  in  advance — they  need  not  be  recomputed  on  the  fly  each 
time.  This  means  that  ray  tracing  amounts  to  a  sequence  of  ma¬ 
trix  multiplications,  vector  additions,  and  calculations  of  inter¬ 
section  points  as  well  as  reflections  or  refractions. 

IV.  Ray  Tracing  in  the  MODIS  Instrument 

This  section  explains  why  standard  ray-tracing  algorithms 
and  software  products  are  not  appropriate  for  NASA’s  purposes. 

A.  MODIS  Optical  Subsystem 

Fig.  10  shows  how  light  reaches  the  optical  detector  in  the 
MODIS  instrument.  Rays  of  lightfrom  the  disk  of  the  sun  first 
strike  an  attenuator.  The  purpose  of  the  attenuator  is  to  reduce 
the  optical  power  of  the  sun's  rays  from  25  W  to  about  8.5%  of 
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Detector 


Fig.  10.  M  ODIS  optical  system  on  theTerra  and  Aqua  spacecraft. 


-  sin  ( Ze )  sin  ( Az )  - 


Fig.  11.  Direction  cosines  of  a  unit-length  ray  from  the  sun  to  M  ODIS. 

this,  or  around  2  W,  in  order  to  perform  instrument  calibration 
without  damaging  the  detector. 

Fig.  11  shows  a  unit-length  ray  us  from  the  sun  arriving  at 
the  M  ODIS  instrument.  The  ray  makes  a  zenith  angle  Ze  with 
the  z-axis  of  the  M  ODIS  instrument.  The  projection  of  the  ray 
onto  the  xy-plane  makes  an  azimuth  angle  Az  with  the  rc-axis. 
In  Fig.  11,  the  ray’s  projections  on  there-,  y-,  and  z-axesall  are 
shown.  The  ray  can  be  decomposed  in  theframeof  reference  of 
the  M  ODIS  instrument 

COS  -  02,. ?A;,  -)-  UJyUy  LOzUz 

=  —  sin  (Ze)  cc >s(Az)ux  —  sin  (Ze)  sin(Az)uy 
+  cos  (Ze)iiz  (41) 

wher eux,uy,  and  uz  are  unit  vectors  in  then;-,  y-,  and  z-direc- 
tions,  respectively.  The  quantities  ujx,ojy,  and  ujz  are  the  direc¬ 
tion  cosines  of  the  ray.  At  the  time  a  calibration  is  performed, 
we  take  the  zenith  angle  Ze  =  77.521°  and  the  azimuth  angle 
Az  =  -24.560°.  These  are  the  correct  values  as  Terra  passes 
over  the  north  pole  once  each  orbit  in  its  sun-synchronous  orbit. 

The  ray  then  passes  through  the  numerous  pinholes  of  the 
attenuator,  a  perforated  metal  screen,  as  described  by  Waluschka 
etal.  [8],  These  pinholes  each  have  a  radius  of  1  mm  and  their 
edges  are  tapered  to  reduce  the  effects  of  the  thickness  of  the 
screen.  The  pinholes  are  arranged  in  a  grid  of  scalene  triangles 
in  order  to  provide  even  illumination  through  the  surfaces  which 
follow  the  rotating  mirror. 

To  be  more  precise,  a  cone  of  rays  passes  through  each  pin¬ 
hole,  due  to  the  fact  that  the  sun  is  not  a  point  source  but  has  a 
disk  of  apparent  half-angle^  =  0. 25°[9j. 5  Despite  thefact  that 
each  pinhole  has  a  sizable  diameter,  we  shall  treat  it  as  a  point 
source  of  a  conical  fan  of  rays.  We  shall  also  neglect  the  fact 
that  the  intensity  of  the  incident  light  is  diminished  by  its  being 
spread,  since  all  pinholes  result  in  a  comparable  diminution  of 
intensity. 

As  described  in  [8],  each  cone  of  light  from  a  pinhole  in  the 
screen  strikes  the  diffusing  surface.  Each  ray  within  each  cone, 

5A  more  accurate  value  is  9S  =  (1919  in/2)  =  0.2665* 


then,  gives  rise  to  another  cone  of  rays  from  the  diffusing  sur¬ 
face.  We  shall  treat  this  diffusing  surface  as  a  Lambertian  sur¬ 
face,  meaning  that  its  radiance  is  independent  of  direction.  Each 
diffusely  reflected  ray  can,  therefore,  be  regarded  as  having  the 
same  radiance.  Consequently,  we  need  not  keep  track  of  the  ra¬ 
diance  associated  with  each  ray. 

After  departing  the  diffusing  surface,  the  rays  enter  an  optical 
system.  It  consists  of  an  initial  reflecting  plane  mirror  which  can 
be  rotated  out  of  the  way  when  images  of  earth  are  taken.  Fol¬ 
lowing  it  are  a  series  of  reflecting  and  refracting  surfaces.  All 
told,  there  are  29  surfaces  preceding  and  following  the  mirror. 
They  are  numbered  Sk  where  k  e  [0,1V  -  1].  Surface  1  spec¬ 
ifies  a  177.8  mm  eye  pupil  diameter.  The  attenuating  screen  is 
surface  2,  the  diffuser  is  surface  3,  and  the  mirror  is  surface  4. 
Associated  with  surface  Sk  is  the  index  of  refraction  nk  of  the 
medium  leading  up  to  the  surface.  For  simplicity,  we  regard  the 
surfaces  as  producing  no  scattering,  absorption,  or  reflection. 

The  task  of  tracing  the  rays  through  this  series  can  be  subdi¬ 
vided  into  the  foil  owing  repeated  tasks,  stated  with  reference  to 
a  unit-length  ray  (bA  in  medium  k  departing  from  point  A  and 
heading  for  the  surface  Sk+1  which  separates  medium  k  from 
medi  um  k+ 1.  A  ssume  poi  nt  A  has  coordi nates  (xAk ,  yAk ,  zAk ) 
in  the  frame  of  reference  of  surface  k.  We  can  represent  this 
poi  nt  by  the  vector  A  =  xAkuXk  +  yAkuVk  A  zAkuZk .  A  ssume 
also  that  the  unit-length  ray  Cja  can  bedecomposed  using  known 
direction  cosines  into  lja  —  cuAxkuXk  A  ^Ayk^yk  A  ^ a zk • 
(The  notation  uak  refers  to  a  unit  vector  in  the  direction  of  the 
a-axis  in  the  coordinate  system  of  surface  Sk). 

1)  Find  the  position  of  point  A  in  a  new  coordinate 
system,  that  of  surface  Sk+1.  In  the  new  coordinate 
system,  z  =  0  at  the  vertex  of  surface  Sk+i-  Point 
A  has  coordinates  (xAk+1,yAk+1,zAk+1)  in  the  new 
coordinate  system  and  can  be  represented  by  vector 

A  =  xAk+1ux^kAi  +  yAk+1a-y,k+ 1  +  2t4fc+i®z,fc+i- 

2)  Find  the  direction  cosines  of  unit-length  ray  lja  in  the 
new  coordinate  system.  The  ray  can  be  expressed  as 

W  '  =  UAxk+1UXk+1  +  U)Ayk+1Uyk+l  +  LdAzk+1UZk+1, 

where  uAak+1  is  the  direction  cosine  with  respect  to  the 
a-axis  in  the  coordinate  system  of  surface  Sk+k. 

3)  Consider  the  plane  P  passing  through  the  vertex  of 
the  surface  Sk+i-  Find  the  point  A'  in  plane  P  eventu¬ 
ally  reached  by  ray  wAk+1.  This  point  has  coordinates 
(xA>  ,yA'k  i,0)  and  can  be  represented  by  vector 
A  =  xA'k+^uXk+1  +  yA’  Uyk+ 1- 

4)  Find  the  point  B  on  surface  Sk+1  struck  by  ray  vAh+1. 
This  point  has  coordinates  (xBk+1,yBk+1,zBk+1).  The 
reason  for  finding  the  point  A'  in  plane  P  is  to  permit 
deciding  whether  or  not  the  ray  gets  through  an  aperture 
associated  with  the  surface.  The  aperture  is  regarded  as 
lying  in  the  xy-  plane.  Rays  which  fail  to  passthrough  the 
aperture  successfully  are  discarded. 

5)  Find  the  new  ray  ujb  —  ^BxkAi'u,xkAi  A  ^BykAi'^,ykAi  fi- 
A>Bzk+1uZk+1  by  using  (30)  if  this  is  a  reflecting  surface 
or  Snell’s  law  (32)  if  it  is  a  refracting  surface. 

T  he  poi  nt  B  w  ith  coordi  nates  ( xBk+1 ,  yBk+1 ,  zBk+1 )  and  new 
ray  CjBk+1  in  the  coordinate  system  of  surface  Sk+1  together 
represent  the  starting  position  and  direction  of  a  ray  which  now 
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Fig.  12.  Hole  separation  in  the  M  ODIS  attenuator. 


Research  Associates.  In  this  section,  we  list  the  elements  in  that 
file  which  describe  the  optical  system. 

A.  General  Items 

Elements  applicable  to  the  entire  system  are  as  follows. 

•  Entrance  pupil  diameter:  the  diameter  of  the  initial  ele¬ 
ment  of  the  optical  system.  Light  rays  falling  outside  this 
diameter  do  not  enter  the  system. 

•  Wavelength:  Ai[A2  A3  . . .]  are  the  several  wavelengths  of 
lightfor  which  individual  indicesof  refraction  may  be  pro¬ 
vided  for  successive  lens  elements. 


heads  off  toward  surface  Sk+2.  The  steps  are  repeated  as  many 
times  as  necessary  to  bring  the  ray  to  its  final  destination  in  the 
focal  plane. 

B.  H  ole  Separation  in  the  Attenuator 

The  pattern  of  hole  separation  on  the  M  ODIS  instrument’s  at¬ 
tenuator  is  shown  in  Fig.  12.  The  attenuator  is  a  rectangle  whose 
half-width  is  w  and  whose  half-height  is  h.  The  coordinates 
of  the  center  of  the  attenuator  are  (0,0,0)  in  its  (local)  coor¬ 
dinate  system.  There  is  a  hole  at  the  center  and  the  horizontal 
spacing  between  holes  is  6.0210  mm.  The  vertical  spacing  be¬ 
tween  holes  is  6.1385  mm.  Each  row  of  holes  is  offset  in  the 
^-direction  by  -475.8  y,m  as  y  increases. 

This  pattern  can  be  used  to  calculate  the  locations  of  all  the 
holes  on  the  attenuator  by  offsetting  it  vertically  and  horizon¬ 
tally  throughout  the  extent  of  the  attenuator.  T  riangl  es  i  n  the  next 
row  up  will  be  offset  by  A  =  -475.8  ym. 

Regarding  the  Oth  row  as  the  one  in  which  the  base  of  the 
triangles  is  on  y  =  0,  the  horizontal  offset  A j  in  any  row  j  is 
A j  =  j'A,  but  since  triangles  shifted  left  or  right  by  the  hori¬ 
zontal  spacing  wr  =  6.0210  mm  are  indistinguishable,  we  can 
take  the  offset  modulo  this  amount:  A,  =  )  A  mod wr. 

Wecan  calculate  the  position  of  the  most  negative  hole  in  row 
j  by 


w  +  A,- 


Ulr 


wr  +  A  j. 


(42) 


The  number  rij  of  holes  which  will  fit  in  row  j  is  given  by 


n.j  =  1  + 


2  w  —  A,- 


(43) 


SotheAth  hole  in  row  j  (where  A  e  [0,  — 1] )  is  at  horizontal 

position 


Xjk  —  -Y:/.iriin  +  AA  j 

and  that  same  hole  is  at  vertical  position 


(44) 


B.  Optical  Surfaces  in  M  ODIS 

Elements  applicable  to  individual  optical  surfaces  in  the 
system  are  as  follows. 

•  Curvature  c  of  the  lens  element.  If  the  curvature  c  =  0, 
then  the  surface  i  s  a  pi  ane;  otherw i  se,  i  t  i  s  treated  as  a  coni  c 
surface. 

•  An  indication  showing  whether  the  surface  is  a  reflecting 
or  a  refracting  surface. 

•  T  he  indices  of  refraction  ni[n2n3  ...]  for  the  surface,  one 
for  each  wavelength  specified  for  the  system  as  a  whole. 

•  Decentering  specifications  xbsAJds,  and  zBs-  These  are 
the  x-,  y-,  and  ^-displacements  from  the  origin  of  the 
global  coordinate  system  to  the  origin  of  the  local  coor¬ 
dinate  system  of  the  optical  surface. 

•  Rotation  specifications  cursors,  and  7RS.  These  are  the 
angles  in  degrees  by  which  the  local  coordinate  system 
would  need  to  be  rotated  in  order  to  cause  it  to  align  with 
the  global  coordinate  system.  The  rotations  are  specified 
as  —  7rs  about  the  z-axis,  /3RS  about  the  y-axis,  and  aRs 
about  the  rc-axis. 

•  Rectangular  aperturedimensionsa;RA  andyRA  (if  the  sur¬ 
face  has  a  rectangular  aperture).  These  are  the  half-width 
of  the  aperture  in  the  ^-direction  and  the  half-height  of 
the  aperture  in  the  y-di  recti  on.  The  aperture  is  regarded  as 
lying  in  the  ary- pi  ane  of  the  local  coordinate  system. 

•  Circular  Aperture  rCA  (if  the  surface  has  a  circular  aper¬ 
ture).  This  is  the  radius  of  the  circular  aperture.  The  aper¬ 
ture  is  regarded  as  lying  in  the  ary-plane  of  the  local  coor¬ 
dinate  system. 

•  Conic  Constant  A.  The  meaning  of  the  possible  values  of 
k  is  given  in  Table  I. 

C.  Image  Plane 

An  element  applicable  to  the  image  plane  alone  is  as  follows. 

•  Image  plane  curvature  ciP.  For  M  ODIS,  ciP  =  0  because 
the  image  plane  is  flat. 


Vjk  =  khr.  (45) 

V.  Optical  System  Description  File 

In  our  implementation  of  ray  tracing,  we  created  an  ASCII 
file  to  describe  the  optical  system  in  the  MODIS  instrument. 
The  format  of  this  file  was  based  loosely  on  the  input  format 
of  the  commercial  ray-tracing  program  CODE  V  from  Optical 


VI.  Implementation 

We  implemented  the  system  on  a  Dell  Precision  Workstation 
530  MT  based  on  an  Intel  Xeon  CPU  operating  at  a  frequency 
of  1.70  G  FI  z.  This  system  used  a  400-M  FI  z  system  bus,  an  8-kB 
LI  cache,  and  a  256-M  B  L2  cache.  It  also  employed  a  Bittware 
Flammerhead  66-M  FI z  PCI  board  containing  four  Analog  De¬ 
vices  21160  DSPs  operating  at  80  MHz.  Our  approach  was  to 
use  the  PC  to  direct  the  operation  of  the  DSPs.  All  ray  tracing 
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was  performed  using  the  mathematical  models  presented  in  Sec¬ 
tions  II  and  III.  The  the  following  are  two  main  parts  to  the  ray 
tracing  program,  one  performed  by  the  PC  and  the  other  per¬ 
formed  by  the  DSPs. 

PC  Task)  Trace  21  rows  x  21  columns  =  441  rays  from 
each  of  485  pinholes  in  the  attenuating  screen  to 
the  poi  nt  w  here  they  stri  ke  the  di  ff  user.  T  here  are 
213  885  such  points. 

DSP  Task)  Trace  241  rows  x  121  columns  =  29  161  rays 
from  each  of  these  diffuser  locations  to  where 
the  rays  strike  the  focal  plane,  if  they  do  indeed 
do  so.  (If  they  do  not  pass  through  all  apertures, 
then  they  fail  to  reach  it.) 

For  the  PC  to  dispatch  DSP  Tasks,  we  used  C-language  library 
routines  provided  by  Bittware.  These  permit  a  C  program  run¬ 
ning  on  a  PC  to  download  programs  to  each  DSP  individu¬ 
ally.  They  also  permit  the  PC  program  to  modify  and  read  the 
memory  associated  with  each  DSP. 

To  synchronize  the  DSPs  with  the  PC,  we  implemented  a  full 
handshaking  scheme.  In  this  scheme,  each  DSP's  memory  con¬ 
tained  two  flags,  DSPReady  and  PCReady.  The  sequence 
was  as  follows. 

1)  After  a  DSP  finished  a  DSP  Task,  it  needed  to  pass  its 
results  to  the  PC  and  obtain  a  new  assignment.  To  do 
so  it  would  set  DSPReady  <-  True  and  idle  while 
PCReady  remained  False. 

2)  When  the  PC  discovered  this  (by  polling),  it  would  set 
PCReady  <—  True.  At  this  point,  it  was  free  to  re¬ 
trieve  the  results  of  the  last  DSP  Task  (if  any)  and  store 
parameters  i  n  the  D  SP  ’s  memory  descri  bi  ng  the  next  task. 
These  parameters  amounted  to  a  point  on  the  diffuser  sur¬ 
face  from  which  to  start  tracing  rays. 

3)  Meanwhile,  the  DSP  set  DSPReady  <—  False  and 
idled  as  long  as  PCReady  was  True. 

4)  On  finishing  its  work,  the  PC  would  set  PCReady  <— 
False.  When  the  DSP  noticed  this,  it  was  free  to  start  its 
next  task  and  the  PC  would  continue  scanning  for  DSPs 
ready  for  new  work  assignments. 

We  put  identical  ray-tracing  programs  in  all  four  DSPs  on  the 
Hammerhead  board  and  performed  tests  using  one,  two,  three, 
and  all  four  of  these. 

Having  determined  a  starting  location  for  a  DSP  Task,  the  PC 
looked  for  an  idle  DSP.  U  pon  finding  one,  it  would  retrieve  the 
results  from  that  DSP’s  last  task  and  dispatch  a  new  task  to  it. 
The  image  plane  of  the  optical  system  was  divided  up  into  a 
rectangular  grid  of  cells.  The  result  of  one  DSP  Task  was  a  2-D 
histogram  showing  the  number  of  times  a  given  cell  was  struck 
by  all  the  rays  in  that  task.  The  PC  would  accumulate  the  sum 
of  all  such  histograms,  yielding  a  composite  indication  of  the 
number  of  rays  striking  any  cell.  It  would  then  resume  scanning 
of  the  available  DSPs. 

The  overall  ray-tracing  scheme  has  not  yet  been  fully  com¬ 
pleted.  We  have  programmed  the  PC  to  assign  tasks  to  the  DSPs 
and  the  DSPs  perform  their  tasks  correctly.  We  consequently 
have  been  able  to  measure  the  time  it  takes  to  trace  rays  through 
the  optical  system.  From  this,  we  have  been  able  to  determine 


TABLE  ii 

Time  to  Trace  349  932  Rays  on  Varying  Numbers  of  DSPs 


Number 

Elapsed 

Time  per 

Expected 

of 

Time 

Ray 

Completion 

Processors 

(s) 

(fls) 

Time  (days) 

1 

122.861 

351.6 

100.8 

2 

61.449 

175.8 

50.4 

3 

40.942 

117.2 

33.6 

4 

30.721 

87.25 

25.2 

the  increase  of  speed  obtained  by  adding  additional  DSPs,  as  de¬ 
scribed  below.  However,  the  PC  Task  to  determine  all  213  885 
rays  striking  the  diffuser  surface  remains  to  be  completed. 

VII.  Performance 

A.  Baseline  Performance 

The  algorithms  described  in  this  paper  were  implemented 
several  years  ago  in  FORTRAN  and  executed  on  a  DEC  Alpha 
computer.  The  program  described  in  [8]  performed  ray  tracing 
for  485  pinholes  with  roughly  441  rays  emanating  from  each 
pinhole.  Each  of  these  rays  generated  a  bundle  of  29 161  rays 
reflected  from  the  diffuser,  for  a  grand  total  of  roughly  6.24  G 
rays.  It  took  the  program  around  14  days  to  complete  the  cal¬ 
culations,  an  average  rate  of  5.16  k  raytraces/s,  equivalent  to 
roughly  194  /^s/raytrace. 

B.  Performance  on  Parallel  Processors 

Table  II  gives  performance  measurements  of  the  ray  tracing 
algorithm  when  distributed  to  multiple  DSPs  executing  in  par¬ 
allel.  The  results  in  the  table  were  determined  by  repeatedly 
tracing  the  same  ray  349  932  times.  The  ray  selected  was  one 
which  traversed  all  28  surfaces  of  the  M  ODIS  optical  system 
without  missing  any  internal  aperture.  The  number  of  times  this 
ray  was  traced  matched  the  number  of  rays  in  12  complete  bun¬ 
dles  of  rays  leaving  the  same  point  of  the  solar  diffuser,  as  ex¬ 
plained  in  Section  VI.  In  practice,  numerous  rays  would  miss 
some  aperture,  cutting  short  the  time  to  trace  such  a  ray,  and  so 
increasing  the  number  of  rays  traced  per  unit  time.  The  results 
shown  here,  therefore,  are  conservative.  The  time  which  would 
be  required  to  complete  all  6.24  G  rays  is  shown  in  the  rightmost 
column.  Times  were  measured  using  theclock()  function  in  the 
C  runtime  library  and  yielded  times  to  the  nearest  millisecond. 

F  i  g.  13  graphs  the  number  of  rays  traced  per  second  as  a  f  unc- 
tion  of  the  number  of  processors  used.  A  linear  curve  fit  shows 
that  within  the  range  of  one  to  four  processors,  the  number  of 
rays  processed  per  second  r  is  related  to  the  number  of  proces¬ 
sors  n  by 

r  =  ((2847.97  ±  l.l)n  +  (0.21  ±  3.1))  rays  •  s_1  (46) 

where  the  error  bounds  are  ±1  standard  deviation. 
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Ray  Traces  as  a  Function  of  the  Number  of  Processors 


Fig.  13.  Preliminary  results  of  ray  tracing  on  the  DSP. 

There  are  several  additional  pertinent  observations  to  be 
made. 

•  It  is  apparent  that  there  is  a  strongly  linear  relationship 
between  the  number  of  rays  which  can  be  traced  in  one 
second  and  the  number  of  DSPs  available  to  do  the  work. 
What  is  not  so  clear  is  how  many  processors  could  be  han¬ 
dled  in  parallel  before  the  host  PC  gets  so  busy  processing 
results  that  it  cannot  keep  up.  Our  measurements  failed  to 
show  any  delay  for  administrative  tasks,  although  the  time 
taken  to  trace  a  ray  bundle  of  29 161  rays  on  the  DSPs, 
about  10  s  per  bundle,  was  easily  measured.  The  negli¬ 
gible  amount  of  overhead  is  reflected  in  the  fact  that  the 
additive  constant  0.21  ±  3.1  in  (46)  is  very  much  smaller 
than  the  coefficient  of  n. 

Since  the  clockQ  routine  can  measure  time  to  the 
nearest  millisecond  and  the  time  for  a  single  DSP  to 
execute  a  bundle  of  work  is  around  10000  ms,  the  ratio 
of  productive  time  to  administrative  overhead  appears  to 
be  at  least  10  000:1.  The  implication  is  that  the  host  PC 
could  service  up  to  10  000  subordinate  DSPs  and  still 
achieve  a  linearly  increasing  number  of  rays  traced  per 
unittime.  With  only  four  DSPs  on  each  Hammerhead  PCI 
board,  that  would  represent  2  500  such  boards,  more  than 
would  ever  fit  on  a  PCI  bus.  Furthermore,  it  is  likely  that 
if  more  DSPs  were  in  service,  they  would  have  to  wait 


occasionally  to  get  more  work,  whereas  waiting  time  was 
minimal  with  just  four  DSPs.  However,  the  very  strong 
linear  relationship  between  the  number  of  DSPs  used  and 
the  number  of  rays  traced  per  second  does  demonstrate 
the  significant  performance  improvement  achievable  by 
using  parallel  processors,  implying  that  this  approach  is 
highly  scalable. 

•  Even  with  four  processors,  the  time  it  takes  to  complete 
the  ray  tracing  exceeds  that  of  the  original  uniprocessor 
solution.  Using  (46),  we  can  see  it  would  take  1.8  proces¬ 
sors  to  match  the  baseline  rate  of  194  y/,s/raytrace. 

•  Table  III  shows  extrapolations  relating  the  number  of 
DSPs  applied  and  the  cost  of  the  necessary  number  of 
PCI  boards  (currently  at  $5595  each)  to  the  number  of 
rays  which  could  be  traced  in  each  second,  along  with  the 
corresponding  amount  of  time  for  tracing  each  ray. 

These  estimates  represent  an  upper  limit.  Many  rays 
will  not  reach  the  detector  and  so  tracing  them  will  ter¬ 
minate  early.  Thus,  the  speeds  achieved  can  be  expected 
to  exceed  these  estimates.  These  extrapolated  estimates 
must  of  course  be  presumed  to  be  ever  less  accurate  as 
the  number  of  DSPs  rise. 

It  would  take  two  Hammerhead  PCI  boards,  therefore, 
to  exceed  the  performance  of  the  original  baseline  pro¬ 
gram.  At  nearly  $12  000  for  such  a  system,  not  counting 
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TABLE  III 

Projection  of  Ray  Tracing  Speeds  With  Additional  DSPs 


Ray 

Time 

traces 

Seconds 

to 

#of 

#  of 

Cost  of 

per 

per  ray 

Completion 

DSPs 

boards 

boards  ($) 

second 

trace 

(days) 

10 

3 

16,785 

28,480 

35  ps 

10.1 

20 

5 

27,975 

57,000 

18  ps 

5.0 

50 

13 

72,735 

142,000 

7.0  ps 

2.0 

Bittware  has  a  new  board  with  eight  250  MHz  ADSP-TS101  TigeiSHARC 
digital  signal  processors  and  a  Xilinx  Virtcx-II  Pro  FPGA.  With  an  increase  in 
processing  speed  by  a  factor  of  6,  this  would  permit  marked  reductions  in 
these  projected  completion  times. 


the  PC  containing  it,  this  is  an  expensive  approach.  How¬ 
ever,  a  combination  of  optimizing  the  code  and  obtaining 
even  faster  results  by  just  adding  more  DSPs  makes  it 
worth  exploring  further.  Also,  the  baseline  performance 
includes  numerous  rays  which  never  reach  thefocal  plane, 
whereas  the  experimental  results  reported  here  include 
only  rays  which  do  make  the  entire  journey. 

•  The  use  of  the  AD21160  for  this  application  is  not  op¬ 
timal.  This  DSP  has  been  optimized  for  typical  signal 
processing  applications  in  which  multiplications  and  ad¬ 
ditions  are  prevalent.  In  contrast,  our  ray  tracing  algo¬ 
rithm  also  requires  repeated  computation  of  reciprocals 
and  square  roots.  The  AD 21160  does  not  provide  hard¬ 
ware  reciprocal  or  square  root  computations.  It  does  have 
instructions  which  give  an  initial  “guess”  at  reciprocals 
and  square  roots  but  additional  software  instructions  are 
required  to  refine  these  guesses.  Two  alternative  means 
which  could  be  used  to  speed  up  the  processing  are  ei¬ 
ther  to  choose  a  different  kind  of  subordinate  processor 
with  more  powerful  floating-point  capabilities  or  offload 
those  computations  from  the  DSP  to  afield  programmable 
gate-array  device.  Of  course,  there  are  some  offsetting  ad¬ 
vantages  to  the  use  of  DSPs:  they  do  not  dissipate  as  much 
heat  as  processors  such  as  the  Intel  Pentium  IV,  and  har¬ 
nessing  them  to  work  in  parallel  is  quite  easy. 

•  There  has  been  very  little  effort  expended  so  far  on  op¬ 
timizing  the  algorithm  executed  in  the  DSPs.  Doing  so 
would  have  the  effect  of  reducing  the  execution  time.  For 
example,  it  might  be  possible  to  reduce  the  use  of  the  op¬ 
erations  of  division  and  square-root  extraction  (which  are 
lengthy  on  theAD21160  DSPs).  For  example,  the  square 
root  operation  entailed  in  (15)  is  avoided  if  we  use  the  ap¬ 
proximation  in  (3).  We  are  now  exploring  a  mesh  solution 
to  obtain  the  intersection  of  a  ray  with  a  conicoid  more  ef¬ 
ficiently. 

•  The  problem  of  tracing  rays  in  a  system  such  as  M  ODIS 
is  dominated  by  computations  as  opposed  to  input/output 
operations.)  This  implies  that  the  computation  could  be 
distributed  to  PCs  connected  to  the  Internet  because  the 
demands  of  communications  would  be  low  compared  to 


the  amount  of  calculation  required.  This  is  the  method 
being  used  in  the  Great  M  ersenne  Prime  Search,6  where 
numerous  volunteers  around  the  world  permit  idle  time 
on  their  computers  to  be  used  to  search  for  the  elusive 
M  ersenne  primes  (only  41  have  ever  been  discovered.) 
On  the  one  hand,  PCs  doing  other  work  simultaneously 
would  have  an  adverse  effect  on  the  overall  completion 
rate.  On  the  other  hand,  devoting  PCs  solely  to  the  ray¬ 
tracing  task  would  be  a  rather  expensive  approach  since 
PCs  are  general-purpose  computers  and  special-purpose 
computers  are  adequate  to  the  task. 


VIII.  Conclusion 

Standard  commercial  ray-tracing  systems  are  not  well  suited 
to  use  with  an  optical  system  I  ike  that  in  M  ODIS  because  of  its 
employment  of  an  attenuator  consisting  of  a  grid  of  pinholes, 
each  of  which  acts  like  a  separate  light  source.  Achieving  a  full 
understanding  of  the  combined  effect  of  such  an  attenuator  can 
be  achieved  by  the  use  of  a  computer  simulation  to  trace  in¬ 
dividual  light  rays  to  their  destination  in  the  focal  plane,  if  in¬ 
deed  they  actually  do  reach  it.  We  implemented  such  a  simulator 
using  a  small  array  of  Analog  Devices  AD21160  digital  signal 
processi  ng  mi croprocessors  programmed  i  n  C .  T  hey  were  under 
the  control  of  a  program  also  written  in  C  and  executing  on  a 
Dell  PC  using  the  Windows  2000  operating  system. 

We  divided  the  ray  tracing  task  into  two  parts.  The  first  of 
these,  performed  by  the  PC,  entailed  locating  the  spots  on  the 
diffuser  plane  reached  by  individual  rays  from  the  sun.  Subse¬ 
quently,  the  cone  of  rays  departing  each  such  spot  was  traced 
by  an  assigned  DSP.  A  2-D  histogram  representing  the  number 
of  rays  striking  each  location  in  thefocal  plane  can  be  accumu¬ 
lated  by  the  PC  program,  permitting  determination  of  the  overall 
illumination  pattern  on  the  screen  without  actually  building  an 
instrument.  This,  in  turn,  permits  the  exploration  of  different 
configurations  of  holes  on  an  attenuator  screen  in  an  attempt  to 
create  a  more  uniform  distribution  of  light  intensities  in  thefocal 
plane. 

Preliminary  experiments  show  a  very  highly  linear  relation¬ 
ship  between  the  speed  of  such  processing  and  the  number  of 
processors  in  the  system.  T radeoffs  between  cost  and  speed  can 
now  be  considered.  Further  work  entails  the  following  nonex- 
haustive  list  of  tasks: 

•  completing  the  program  so  that  all  parts  of  the  simulation 
can  be  performed; 

•  improving  the  efficiency  of  the  ray  tracing  algorithms  by 
optimizing  loops  and  by  minimizing  the  use  of  division 
and  square  root  operations; 

•  considering  the  use  of  alternate  processors  with  more 
efficient  floating  point  capabilities  (for  example,  the 
600-M  Hz  ADSP-TS201S  TigerSHARC); 

•  exploring  the  use  of  processors  distributed  widely  across 
a  network; 

•  investigating  the  limitations  of  expansion  using  the  PCI 
bus;  and 

6M  ersenne  Prime  Search  (2003).  http://www.mersenne.org/prime.htm 
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•  measuring  the  speed  Of  execution  Of  the  algorithm  on  PCs  [13]  W.  H.  Beyer,  Ed.,  CRC  Standard  Mathematical  Tables,  26th  ed.  Boca 
of  various  kinds.  Raton- FL:  CRC' 198L 


Appendix 

Notation 

A  scalar  value  x  is  represented  using  italic  font. 

A  point  P  =  (x,y,z)  is  represented  in  sans-serif  font. 

A  unit-length  vector  a  is  represented  in  bold  italic  with  a  cir¬ 
cumflex  above.  T  he  uni t  vectors  paral  lei  to  the  x-,y-,  and  2-axes 
ar eux,uy,  and  uz,  respectively. 

A  vectorr  =  rxux+ryuy+rzuz  is  represented  in  bold  italic. 
We  frequently  treat  a  point  P  as  if  it  were  equivalent  to  its  vector 
representation  P. 

The  dot  product  of  two  vectors  is  a  scalar  quantity  defined  by 
n  r2  =  IriH^lcos#,  where  6  is  the  angle  separating  the  two 
vectors. 

The  cross  product  of  two  vectors  is  a  vector  whose  magni¬ 
tude  is  given  by  n  x  r2  =  |ri | \r2 |  sin 6,  where  8  is  the  angle 
separating  the  two  vectors.  The  direction  of  the  cross  product  is 
determined  by  the  right-hand  rule.  In  Cartesian  coordinates  the 
cross  product  can  be  computed  as  the  determinant 
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We  represent  the  direction  of  a  ray  by  a  unit-vector  parallel 
to  it.  For  example,  Cj  =  Lux  +  Muy  +  Nuz  =  cos 8xux  + 
cos 8yUy  +  cos 8zuz.  L,M,  and  N  are  the  direction  cosines  of 
the  ray  and  L2  +  M2  +  N2  =  1. 
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those  specialties  to  help  create  the  cockpits  of  the  future  and  to  work  on  the 
unmanned  flight  programs  currently  being  implemented  by  the  Air  Force. 
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